perf: bulk text block scanner bypasses fastparse per-line overhead #689
Open
He-Pin wants to merge 1 commit into databricks:master
Replace the per-line fastparse combinator loop in tripleBarStringBody with a custom bulk scanner that directly accesses the underlying String data. For a 600KB text block with ~8000 lines, this eliminates ~8000 intermediate String allocations and the Seq[String] + mkString join overhead.

Key changes:
- tripleBarStringBodyBulk: Custom scanner using IndexedParserInput.data for zero-copy StringBuilder.append(CharSequence, start, end) instead of fastparse's repX combinator, which creates one String per line.
- Hybrid approach: the first line still uses fastparse for proper error messages; subsequent lines use the bulk scanner.
- constructString: Skip string interning for strings >1024 chars (avoids expensive hashCode computation on 600KB strings), single-string fast path, pre-sized StringBuilder for multi-line blocks.
- Falls back to the original fastparse path for non-IndexedParserInput.

JMH large_string_template: 2.251 → 1.762 ms/op (-21.7%)
Native large_string_template: ~37% faster

Upstream: explored in he-pin/sjsonnet jit branch
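To illustrate the bulk-scan idea from the commit message, here is a minimal sketch, not the PR's actual tripleBarStringBodyBulk. It assumes the caller has already consumed the opening ||| line and determined the (non-empty) indentation of the first content line. The object name, method name, and signature are hypothetical, and CRLF handling and error reporting are omitted. It copies each line's payload straight from the source String into one builder with java.lang.StringBuilder.append(CharSequence, int, int), so no per-line String is allocated.

```scala
object TextBlockScanSketch {

  /** Hypothetical helper for illustration only.
    *
    * @param data   the whole source text
    * @param start  index of the first character of the first content line
    * @param indent leading whitespace of that first line (assumed non-empty)
    * @return the block body and the index just past the closing `|||`,
    *         or None if the block is malformed or unterminated
    */
  def scan(data: String, start: Int, indent: String): Option[(String, Int)] = {
    require(indent.nonEmpty, "a text block's first line must be indented")
    // One builder for the whole block; a real implementation could pre-size it.
    val sb = new java.lang.StringBuilder(64)
    var pos = start
    while (pos < data.length) {
      if (data.charAt(pos) == '\n') {
        // Blank line inside the block: keep it as a bare newline.
        sb.append('\n')
        pos += 1
      } else if (data.startsWith(indent, pos)) {
        // Content line: copy everything after the indent, newline included,
        // directly from the source without building an intermediate String.
        val nl = data.indexOf('\n', pos)
        if (nl < 0) return None // input ended before the terminator
        sb.append(data, pos + indent.length, nl + 1)
        pos = nl + 1
      } else {
        // Less-indented line: it must be optional whitespace followed by `|||`.
        var p = pos
        while (p < data.length && (data.charAt(p) == ' ' || data.charAt(p) == '\t')) p += 1
        return if (data.startsWith("|||", p)) Some((sb.toString, p + 3)) else None
      }
    }
    None // reached end of input without finding the terminator
  }
}
```

A caller would pass the index of the first content line, for example `TextBlockScanSketch.scan(src, 0, "  ")` for a block indented by two spaces. Returning None rather than throwing leaves room for a fallback to the combinator-based path, in the spirit of the hybrid approach described above.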
Motivation
Text blocks (||| syntax) are parsed line-by-line through fastparse, which incurs per-line combinator overhead for each newline. Programs with large text blocks (templates, embedded configs) pay this cost unnecessarily.
Key Design Decision
Implement a bulk scanner that directly scans for the text block terminator (|||) using a simple character loop, bypassing the fastparse per-line combinator overhead entirely. The scanner processes the entire text block in a single pass.
Modification
||| terminator without per-line fastparse dispatch
Benchmark Results
JMH (JVM, 3 warmup iterations + 3 measurement iterations)
Analysis
The improvement is modest but consistent across all benchmarks. The benefit will be larger for programs with many or large text blocks. Since parsing is typically a small fraction of total eval time, the -5.7% to -17.6% range is expected.
Result
All 46 tests pass. All benchmarks positive, no regressions.